ClaimCheck: Real-Time Fact-Checking with Small Language Models
Putta, Akshith Reddy, Devasier, Jacob, Li, Chengkai
We introduce ClaimCheck, an LLM-guided automatic fact-checking system designed to verify real-world claims using live Web evidence and small language models. Unlike prior systems that rely on large, closed-source models and static knowledge stores, ClaimCheck employs a transparent, stepwise verification pipeline that mirrors human fact-checking workflows: Web search query planning, Web-based evidence retrieval and summarization, evidence synthesis and re-retrieval, and claim verdict evaluation. Each module is optimized for small LLMs, allowing the system to deliver accurate and interpretable fact-checking with significantly lower computational requirements. Despite using a much smaller Qwen3-4B model, ClaimCheck achieves state-of-the-art accuracy of 76.4% on the AVeriTeC dataset, outperforming previous approaches using LLaMA3.1 70B and GPT-4o. Extensive ablations demonstrate that careful modular design and prompting strategies can overcome the limitations of smaller LLMs. To promote accessibility and transparency, we provide a public demo at https://idir.uta.edu/claimcheck.
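The four-stage pipeline described in this abstract can be sketched as a chain of small functions, one per stage. Everything below (function names, the toy LLM and search stand-ins) is illustrative only, not the authors' actual code.

```python
def plan_queries(claim, llm):
    """Stage 1: ask the model to plan Web search queries for the claim."""
    return [q for q in llm("plan", claim).splitlines() if q]

def retrieve_evidence(queries, search, llm):
    """Stages 2-3: fetch documents per query and summarize them into evidence notes."""
    return [llm("summarize", doc) for q in queries for doc in search(q)]

def judge(claim, notes, llm):
    """Stage 4: produce a verdict from the claim plus synthesized evidence."""
    return llm("verdict", claim + " | " + "; ".join(notes))

def fact_check(claim, llm, search):
    queries = plan_queries(claim, llm)
    notes = retrieve_evidence(queries, search, llm)
    return judge(claim, notes, llm)

# Toy stand-ins so the sketch runs end to end without a real model or search API.
def toy_llm(task, text):
    return {"plan": "search: " + text,
            "summarize": "evidence: " + text}.get(task, "Supported")

def toy_search(query):
    return ["document matching " + query]

verdict = fact_check("The sky is blue.", toy_llm, toy_search)
```

Because each stage is a separate function, a small LLM can be prompted and evaluated per stage rather than asked to do everything at once, which is the design intuition the abstract emphasizes.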
FedRAG: A Framework for Fine-Tuning Retrieval-Augmented Generation Systems
Fajardo, Val Andrei, Emerson, David B., Singh, Amandeep, Chatrath, Veronica, Lotif, Marcelo, Theja, Ravi, Cheung, Alex, Matsuba, Izuki
Retrieval-augmented generation (RAG) systems have been shown to be effective in addressing many of the drawbacks of relying solely on the parametric memory of large language models. Recent work has demonstrated that RAG systems can be improved via fine-tuning of their retriever and generator models. In this work, we introduce FedRAG, a framework for fine-tuning RAG systems across centralized and federated architectures. FedRAG supports state-of-the-art fine-tuning methods, offering a simple and intuitive interface and a seamless conversion from centralized to federated training tasks. FedRAG is also deeply integrated with the modern RAG ecosystem, filling a critical gap in available tools.
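As a rough illustration of the federated side of such a framework, a server can aggregate fine-tuned weights returned by clients with plain federated averaging (FedAvg). The dict-of-floats representation and function name below are my own simplification, not FedRAG's real API.

```python
def fedavg(client_weights):
    """Server-side FedAvg: element-wise mean of per-client weight dicts."""
    n = len(client_weights)
    return {k: sum(w[k] for w in client_weights) / n for k in client_weights[0]}

# Two clients that fine-tuned the same (tiny) retriever locally.
clients = [{"w": 1.0, "b": 0.0},
           {"w": 3.0, "b": 2.0}]
global_weights = fedavg(clients)   # {'w': 2.0, 'b': 1.0}
```

The "seamless conversion" the abstract mentions amounts to swapping a local optimizer step for a round of client updates plus an aggregation like this one.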
DistRAG: Towards Distance-Based Spatial Reasoning in LLMs
Schneider, Nicole R, Ramachandran, Nandini, O'Sullivan, Kent, Samet, Hanan
Many real-world tasks where Large Language Models (LLMs) can be used require spatial reasoning, such as Point of Interest (POI) recommendation and itinerary planning. However, on their own, LLMs lack reliable spatial reasoning capabilities, especially about distances. To address this problem, we develop a novel approach, DistRAG, that enables an LLM to retrieve relevant spatial information not explicitly learned during training. Our method encodes the geodesic distances between cities and towns in a graph and retrieves a context subgraph relevant to the question. Using this technique, our method enables an LLM to answer distance-based reasoning questions that it otherwise cannot answer. Given the vast array of possible places an LLM could be asked about, DistRAG offers a flexible first step towards providing a rudimentary 'world model' to complement the linguistic knowledge held in LLMs.
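A minimal sketch of the distance-graph idea: precompute great-circle (geodesic) distances between known places, then hand the LLM only the edges relevant to the question. The city coordinates, the substring-match retrieval, and all names here are illustrative choices, not DistRAG's implementation.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))

CITIES = {"Sydney": (-33.87, 151.21),
          "New York": (40.71, -74.01),
          "College Park": (38.99, -76.94)}

def distance_edges(cities):
    """Precompute the pairwise distance graph over known places."""
    names = sorted(cities)
    return {(u, v): haversine_km(cities[u], cities[v])
            for i, u in enumerate(names) for v in names[i + 1:]}

def context_subgraph(question, edges):
    """Retrieve only the edges whose endpoints both appear in the question."""
    return {e: d for e, d in edges.items() if e[0] in question and e[1] in question}

edges = distance_edges(CITIES)
ctx = context_subgraph("How far is New York from College Park?", edges)
```

The retrieved subgraph (here a single edge of roughly 300 km) is what gets serialized into the prompt, keeping the context small no matter how many places the full graph covers.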
The Automated Verification of Textual Claims (AVeriTeC) Shared Task
Schlichtkrull, Michael, Chen, Yulong, Whitehouse, Chenxi, Deng, Zhenyun, Akhtar, Mubashara, Aly, Rami, Guo, Zhijiang, Christodoulopoulos, Christos, Cocarascu, Oana, Mittal, Arpit, Thorne, James, Vlachos, Andreas
The Automated Verification of Textual Claims (AVeriTeC) shared task asks participants to retrieve evidence and predict veracity for real-world claims checked by fact-checkers. Evidence can be found either via a search engine, or via a knowledge store provided by the organisers. Submissions are evaluated using AVeriTeC score, which considers a claim to be accurately verified if and only if both the verdict is correct and retrieved evidence is considered to meet a certain quality threshold. The shared task received 21 submissions, 18 of which surpassed our baseline. The winning team was TUDA_MAI with an AVeriTeC score of 63%. In this paper we describe the shared task, present the full results, and highlight key takeaways from the shared task.
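The scoring rule described above can be sketched as follows. Note that the real shared task computes the evidence-quality side with Hungarian-matched METEOR against gold evidence; that whole computation is abstracted here into a single `evidence_quality` number, so this is a simplification for illustration.

```python
def averitec_correct(pred_verdict, gold_verdict, evidence_quality, threshold=0.25):
    """A claim is accurately verified iff the verdict matches AND evidence clears the bar."""
    return pred_verdict == gold_verdict and evidence_quality >= threshold

def averitec_score(examples, threshold=0.25):
    """examples: (predicted verdict, gold verdict, evidence-quality score) triples."""
    hits = sum(averitec_correct(p, g, q, threshold) for p, g, q in examples)
    return hits / len(examples)

examples = [("Supported", "Supported", 0.60),   # right verdict, good evidence
            ("Refuted",   "Supported", 0.90),   # good evidence, wrong verdict
            ("Supported", "Supported", 0.10)]   # right verdict, weak evidence
score = averitec_score(examples)                # only the first example counts
```

The conjunction is the key design point: neither a lucky verdict without evidence nor good evidence with the wrong label earns credit.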
AIC CTU system at AVeriTeC: Re-framing automated fact-checking as a simple RAG task
Ullrich, Herbert, Mlynář, Tomáš, Drchal, Jan
This paper describes our 3rd-place submission in the AVeriTeC shared task, in which we address the challenge of fact-checking with evidence retrieved in the wild using a simple Retrieval-Augmented Generation (RAG) scheme designed for the task, leveraging the predictive power of Large Language Models. We release our codebase and explain its two modules, the Retriever and the Evidence & Label generator, in detail, justifying features such as MMR-reranking and Likert-scale confidence estimation. We evaluate our solution on the AVeriTeC dev and test sets and interpret the results, selecting GPT-4o as the most appropriate model for our pipeline at the time of publication, with Llama 3.1 70B as a promising open-source alternative. An empirical error analysis shows that faults in our predictions often coincide with noise in the data or ambiguous fact-checks, motivating further research and data augmentation.
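MMR-reranking, one of the retriever features mentioned above, fits in a few lines: each greedy pick trades query relevance against redundancy with the documents already selected. The similarity matrices below are toy numbers, and the function is a generic Maximal Marginal Relevance sketch rather than the authors' exact implementation.

```python
def mmr_rerank(query_sim, doc_sims, k=3, lam=0.7):
    """Maximal Marginal Relevance: greedily select k documents that are
    relevant to the query yet dissimilar to those already chosen.
    query_sim[i]   : similarity of document i to the query
    doc_sims[i][j] : similarity between documents i and j
    lam            : relevance/diversity trade-off in [0, 1]
    """
    selected, candidates = [], list(range(len(query_sim)))
    while candidates and len(selected) < k:
        def score(i):
            redundancy = max((doc_sims[i][j] for j in selected), default=0.0)
            return lam * query_sim[i] - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

query_sim = [0.9, 0.85, 0.3]          # document 1 is a near-duplicate of document 0
doc_sims = [[1.0, 0.95, 0.1],
            [0.95, 1.0, 0.1],
            [0.1, 0.1, 1.0]]
order = mmr_rerank(query_sim, doc_sims, k=2, lam=0.5)   # [0, 2]: skips the duplicate
```

With `lam=1.0` this reduces to plain relevance ranking; lowering `lam` is what lets the retriever avoid stuffing the prompt with near-identical evidence passages.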
DANA: Domain-Aware Neurosymbolic Agents for Consistency and Accuracy
Luong, Vinh, Dinh, Sang, Raghavan, Shruti, Nguyen, William, Nguyen, Zooey, Le, Quynh, Vo, Hung, Maegaito, Kentaro, Nguyen, Loc, Nguyen, Thao, Ha, Anh Hai, Nguyen, Christopher
Large Language Models (LLMs) have shown remarkable capabilities, but their inherent probabilistic nature often leads to inconsistency and inaccuracy in complex problem-solving tasks. This paper introduces DANA (Domain-Aware Neurosymbolic Agent), an architecture that addresses these issues by integrating domain-specific knowledge with neurosymbolic approaches. We begin by analyzing current AI architectures, including AutoGPT, LangChain ReAct and OpenAI's ChatGPT, through a neurosymbolic lens, highlighting how their reliance on probabilistic inference contributes to inconsistent outputs. In response, DANA captures and applies domain expertise in both natural-language and symbolic forms, enabling more deterministic and reliable problem-solving behaviors. We implement a variant of DANA using Hierarchical Task Plans (HTPs) in the open-source OpenSSA framework. This implementation achieves over 90% accuracy on the FinanceBench financial-analysis benchmark, significantly outperforming current LLM-based systems in both consistency and accuracy. Application of DANA in physical industries such as semiconductors shows that its flexible architecture for incorporating knowledge is effective in mitigating the probabilistic limitations of LLMs and has potential in tackling complex, real-world problems that require reliability and precision.
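A Hierarchical Task Plan can be pictured as a tree whose leaves are deterministic solvers and whose internal nodes combine subtask results. The class and the toy plan below are my own minimal sketch of that idea, not the OpenSSA API.

```python
class Task:
    """HTP node: a leaf runs a deterministic solver; an internal node
    decomposes into subtasks and combines their results."""
    def __init__(self, name, solver=None, subtasks=(), combine=None):
        self.name, self.solver = name, solver
        self.subtasks, self.combine = list(subtasks), combine

    def execute(self, context):
        if self.solver is not None:
            return self.solver(context)
        return self.combine([t.execute(context) for t in self.subtasks])

# Toy financial-analysis plan: gross margin = (revenue - cost) / revenue.
plan = Task("gross-margin",
            subtasks=[Task("revenue", solver=lambda c: c["revenue"]),
                      Task("cost", solver=lambda c: c["cost"])],
            combine=lambda r: (r[0] - r[1]) / r[0])
margin = plan.execute({"revenue": 200.0, "cost": 150.0})   # 0.25, same answer every run
```

The determinism the abstract stresses falls out of the structure: once the plan is fixed, the same context always yields the same result, unlike free-form sampled reasoning.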
Hybrid Context Retrieval Augmented Generation Pipeline: LLM-Augmented Knowledge Graphs and Vector Database for Accreditation Reporting Assistance
In higher education, accreditation is a quality assurance process in which an institution demonstrates a commitment to delivering high-quality programs and services to its students. For business schools, nationally and internationally, the Association to Advance Collegiate Schools of Business (AACSB) accreditation is the gold standard. For a business school to receive and subsequently maintain accreditation, the school must undertake a rigorous, time-consuming reporting and peer review process to demonstrate alignment with the AACSB Standards. For this project we create a hybrid context retrieval augmented generation pipeline that can assist in the documentation alignment and reporting process necessary for accreditation. We implement both a vector database and a knowledge graph as knowledge stores containing both institutional data and AACSB Standard data. The output of the pipeline can be used by institution stakeholders to build their accreditation report, dually grounded by the context from the knowledge stores. To develop our knowledge graphs we utilized both a manual construction process and an LLM Augmented Knowledge Graph approach. We evaluated the pipeline using the RAGAs framework and observed optimal performance on answer relevancy and answer correctness metrics.
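The hybrid retrieval step can be sketched as two independent lookups merged into one context: nearest-neighbor chunks from a vector store and matching triples from a knowledge graph. All names, the cosine-similarity store, and the toy data below are illustrative assumptions, not the authors' pipeline.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def vector_hits(query_vec, store, k=2):
    """store: list of (text, embedding) pairs; return the k nearest chunks."""
    ranked = sorted(store, key=lambda p: cosine(query_vec, p[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def graph_hits(entity, triples):
    """triples: (subject, relation, object) facts from the knowledge graph."""
    return [t for t in triples if entity in (t[0], t[2])]

def hybrid_context(query_vec, entity, store, triples):
    """Merge both knowledge stores into one dually grounded context."""
    return {"chunks": vector_hits(query_vec, store),
            "facts": graph_hits(entity, triples)}

store = [("chunk about AACSB reporting standards", [1.0, 0.0]),
         ("unrelated campus news", [0.0, 1.0])]
triples = [("AACSB", "defines", "Standard 4"),
           ("Registrar", "maintains", "enrollment data")]
ctx = hybrid_context([1.0, 0.0], "AACSB", store, triples)
```

Grounding the generator in both stores at once is what lets an answer cite a Standard (graph fact) alongside the institutional document text (vector chunk) that satisfies it.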
Retrieval-based Knowledge Transfer: An Effective Approach for Extreme Large Language Model Compression
Liu, Jiduan, Liu, Jiahao, Wang, Qifan, Wang, Jingang, Cai, Xunliang, Zhao, Dongyan, Wang, Ran Lucien, Yan, Rui
Large-scale pre-trained language models (LLMs) have demonstrated exceptional performance in various natural language processing (NLP) tasks. However, the massive size of these models poses huge challenges for their deployment in real-world applications. While numerous model compression techniques have been proposed, most of them are not well-suited for achieving extreme model compression when there is a significant gap in model scale. In this paper, we introduce a novel compression paradigm called Retrieval-based Knowledge Transfer (RetriKT), which effectively transfers the knowledge of LLMs to extremely small-scale models (e.g., 1%). In particular, our approach extracts knowledge from LLMs to construct a knowledge store, from which the small-scale model can retrieve relevant information and leverage it for effective inference. To improve the quality of the model, soft prompt tuning and Proximal Policy Optimization (PPO) reinforcement learning techniques are employed. Extensive experiments are conducted on low-resource tasks from SuperGLUE and GLUE benchmarks. The results demonstrate that the proposed approach significantly enhances the performance of small-scale models by leveraging the knowledge from LLMs.
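The core retrieval-based transfer loop can be sketched in a few lines: the large model's outputs are generated once and cached in a knowledge store, and the small model retrieves the best-matching entry at inference time. The word-overlap similarity and every name here are my own toy choices, not RetriKT's actual components (which also involve soft prompt tuning and PPO, omitted here).

```python
def build_store(inputs, big_model):
    """Run the large model once per input and cache its answers as the knowledge store."""
    return [(x, big_model(x)) for x in inputs]

def retrieve(store, query, sim):
    """Small model's retrieval step: return the stored answer whose key best matches."""
    return max(store, key=lambda entry: sim(entry[0], query))[1]

def word_overlap(a, b):
    return len(set(a.split()) & set(b.split()))

# Toy stand-in for an expensive LLM call.
big_model = {"capital of france": "Paris", "largest ocean": "Pacific"}.get
store = build_store(["capital of france", "largest ocean"], big_model)
answer = retrieve(store, "what is the capital of france", word_overlap)
```

The expensive model is queried only at store-construction time; at inference the small model pays just the cost of a similarity search, which is the economics the paper's "extreme compression" claim rests on.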
SingularityNET's 2022 Progress Towards AGI
Today, we'd like to share a special update, a deep dive into the AGI progress made by SingularityNET in 2022, and an overview of what makes OpenCog Hyperon -- SingularityNET's approach to an AGI framework -- different from other AI systems. As AI systems demonstrate greater practical functionality each year, it becomes increasingly apparent that the breakthrough from narrow AI to Artificial General Intelligence is near. However, there is still no agreement among researchers about how the breakthrough will be made. While deep neural networks have demonstrated impressive capabilities for impersonating intelligence and producing intelligent-looking artifacts, their complete lack of commonsense understanding and real-world symbol grounding makes it appear unlikely that they can serve as the core component of a true AGI system. It's possible that computational neuroscience simulations will make huge strides, or that AGI will spontaneously emerge from self-organizing networks like the SingularityNET Platform without coordinated planning -- but it seems more likely that some new innovation in cognitive architecture and/or learning and reasoning algorithms will be needed alongside these.
The Expertise Level
Computers are quickly gaining on us. Artificial systems are now exceeding the performance of human experts in several domains. However, we do not yet have a deep definition of expertise. This paper examines the nature of expertise and presents an abstract knowledge-level and skill-level description of expertise. A new level lying above the Knowledge Level, called the Expertise Level, is introduced to describe the skills of an expert without having to worry about details of the knowledge required. The Model of Expertise is introduced combining the knowledge-level and expertise-level descriptions. Application of the model to the fields of cognitive architectures and human cognitive augmentation is demonstrated and several famous intelligent systems are analyzed with the model.